Overview

Dataset statistics

Number of variables: 11
Number of observations: 18905
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 MiB
Average record size in memory: 124.0 B

Variable types

Numeric: 10
Categorical: 1

Alerts

fLength is highly overall correlated with fWidth and 7 other fields (High correlation)
fWidth is highly overall correlated with fLength and 6 other fields (High correlation)
fSize is highly overall correlated with fLength and 4 other fields (High correlation)
fConc is highly overall correlated with fLength and 4 other fields (High correlation)
fConc1 is highly overall correlated with fLength and 4 other fields (High correlation)
fAsym is highly overall correlated with fLength and 2 other fields (High correlation)
fM3Long is highly overall correlated with fLength and 5 other fields (High correlation)
fM3Trans is highly overall correlated with fLength and 1 other field (High correlation)
fAlpha is highly overall correlated with class (High correlation)
fDist is highly overall correlated with fLength (High correlation)
class is highly overall correlated with fAlpha (High correlation)

Reproduction

Analysis started: 2022-11-26 18:14:43.929646
Analysis finished: 2022-11-26 18:15:18.931535
Duration: 35 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

fLength
Real number (ℝ)

Distinct: 18643
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.161416
Minimum: 4.2835
Maximum: 334.177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 4.2835
5-th percentile: 16.43074
Q1: 24.3597
Median: 37.1295
Q3: 69.9754
95-th percentile: 139.1416
Maximum: 334.177
Range: 329.8935
Interquartile range (IQR): 45.6157
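
The two derived rows follow directly from the listed quantiles: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A minimal check, with the fLength values hard-coded from this report (the helper name `spread_stats` is illustrative, not part of pandas-profiling):

```python
def spread_stats(minimum, q1, q3, maximum):
    """Return (range, IQR) computed from four summary quantiles."""
    return maximum - minimum, q3 - q1

# fLength values copied from the quantile statistics in this report.
value_range, iqr = spread_stats(4.2835, 24.3597, 69.9754, 334.177)
print(round(value_range, 4), round(iqr, 4))
```

Both results agree with the reported Range and Interquartile range to four decimal places.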

Descriptive statistics

Standard deviation: 42.259789
Coefficient of variation (CV): 0.79493348
Kurtosis: 5.031315
Mean: 53.161416
Median Absolute Deviation (MAD): 16.288
Skewness: 2.0219617
Sum: 1005016.6
Variance: 1785.8898
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
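
A histogram with fixed size bins splits the interval from the minimum to the maximum into equally wide bins and counts the values that fall in each. The sketch below is one plain-Python way to do that with bins=50; the function name and the toy input are assumptions for illustration, and the report itself is rendered by Matplotlib rather than by this code.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1) if width > 0 else 0
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# Toy input: a handful of fLength summary values from this report.
counts, edges = fixed_width_histogram(
    [4.2835, 37.1295, 53.1614, 139.1416, 334.177], bins=50
)
print(edges[1] - edges[0])  # width of one bin
```

With the fLength range of 329.8935, each of the 50 bins is roughly 6.6 units wide.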
Common values

Value                 Count  Frequency (%)
19.1572               3      < 0.1%
24.8332               3      < 0.1%
26.9187               3      < 0.1%
31.3405               2      < 0.1%
12.9833               2      < 0.1%
61.6736               2      < 0.1%
20.1648               2      < 0.1%
21.0734               2      < 0.1%
20.9469               2      < 0.1%
24.5405               2      < 0.1%
Other values (18633)  18882  99.9%

Minimum 10 values

Value      Count  Frequency (%)
4.2835     1      < 0.1%
7.2079     1      < 0.1%
7.3606     1      < 0.1%
8.0518     1      < 0.1%
8.2304     1      < 0.1%
8.2311     1      < 0.1%
8.4802     1      < 0.1%
8.5738     1      < 0.1%
8.601      1      < 0.1%
8.6998     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
334.177    1      < 0.1%
310.61     1      < 0.1%
305.422    1      < 0.1%
305.324    1      < 0.1%
305.0961   1      < 0.1%
303.5676   1      < 0.1%
303.2787   1      < 0.1%
299.9304   1      < 0.1%
297.1239   1      < 0.1%
295.672    1      < 0.1%

fWidth
Real number (ℝ)

Distinct: 18200
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.145872
Minimum: 0
Maximum: 256.382
Zeros: 98
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.398
Q1: 11.8742
Median: 17.1438
Q3: 24.7124
95-th percentile: 58.34608
Maximum: 256.382
Range: 256.382
Interquartile range (IQR): 12.8382

Descriptive statistics

Standard deviation: 18.300664
Coefficient of variation (CV): 0.82636909
Kurtosis: 17.013498
Mean: 22.145872
Median Absolute Deviation (MAD): 5.8606
Skewness: 3.3945407
Sum: 418667.71
Variance: 334.9143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0                     98     0.5%
10.7539               4      < 0.1%
15.8644               3      < 0.1%
0.0028                3      < 0.1%
15.0295               3      < 0.1%
20.2021               3      < 0.1%
0.0029                3      < 0.1%
10.5084               3      < 0.1%
10.0342               3      < 0.1%
12.8155               3      < 0.1%
Other values (18190)  18779  99.3%

Minimum 10 values

Value      Count  Frequency (%)
0          98     0.5%
0.0001     3      < 0.1%
0.0002     1      < 0.1%
0.0006     1      < 0.1%
0.0019     1      < 0.1%
0.0025     2      < 0.1%
0.0026     2      < 0.1%
0.0027     1      < 0.1%
0.0028     3      < 0.1%
0.0029     3      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
256.382    1      < 0.1%
228.0385   1      < 0.1%
220.5144   1      < 0.1%
201.364    1      < 0.1%
190.5432   1      < 0.1%
190.139    1      < 0.1%
188.8866   1      < 0.1%
186.928    1      < 0.1%
179.2924   1      < 0.1%
177.782    1      < 0.1%

fSize
Real number (ℝ)

Distinct: 7228
Distinct (%): 38.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8246431
Minimum: 1.9413
Maximum: 5.3233
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 1.9413
5-th percentile: 2.19418
Q1: 2.4771
Median: 2.74
Q3: 3.1011
95-th percentile: 3.71468
Maximum: 5.3233
Range: 3.382
Interquartile range (IQR): 0.624

Descriptive statistics

Standard deviation: 0.4723766
Coefficient of variation (CV): 0.16723408
Kurtosis: 0.72315407
Mean: 2.8246431
Median Absolute Deviation (MAD): 0.2992
Skewness: 0.87304282
Sum: 53399.878
Variance: 0.22313965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
2.1508                27     0.1%
2.0774                24     0.1%
2.1287                24     0.1%
2.1319                23     0.1%
2.3139                22     0.1%
2.1414                22     0.1%
2.1351                22     0.1%
2.29                  21     0.1%
2.1717                20     0.1%
2.3483                20     0.1%
Other values (7218)   18680  98.8%

Minimum 10 values

Value      Count  Frequency (%)
1.9413     1      < 0.1%
1.9468     1      < 0.1%
1.9916     1      < 0.1%
1.9978     1      < 0.1%
2.0022     1      < 0.1%
2.0065     2      < 0.1%
2.0107     3      < 0.1%
2.0149     4      < 0.1%
2.0191     1      < 0.1%
2.0233     8      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
5.3233     1      < 0.1%
5.1795     1      < 0.1%
5.1467     1      < 0.1%
5.0118     1      < 0.1%
5.01       1      < 0.1%
4.9946     1      < 0.1%
4.9518     1      < 0.1%
4.9369     1      < 0.1%
4.905      1      < 0.1%
4.8501     1      < 0.1%

fConc
Real number (ℝ)

Distinct: 6410
Distinct (%): 33.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38024713
Minimum: 0.0131
Maximum: 0.893
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 0.0131
5-th percentile: 0.12652
Q1: 0.2358
Median: 0.354
Q3: 0.5035
95-th percentile: 0.7342
Maximum: 0.893
Range: 0.8799
Interquartile range (IQR): 0.2677

Descriptive statistics

Standard deviation: 0.18270933
Coefficient of variation (CV): 0.48050155
Kurtosis: -0.51687105
Mean: 0.38024713
Median Absolute Deviation (MAD): 0.13
Skewness: 0.48853997
Sum: 7188.5719
Variance: 0.033382701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.6                   15     0.1%
0.4116                12     0.1%
0.2979                12     0.1%
0.4                   12     0.1%
0.2214                11     0.1%
0.2175                11     0.1%
0.5                   11     0.1%
0.6154                11     0.1%
0.193                 11     0.1%
0.2802                10     0.1%
Other values (6400)   18789  99.4%

Minimum 10 values

Value      Count  Frequency (%)
0.0131     1      < 0.1%
0.0133     1      < 0.1%
0.0137     1      < 0.1%
0.0139     2      < 0.1%
0.0158     1      < 0.1%
0.0162     1      < 0.1%
0.0171     1      < 0.1%
0.0188     1      < 0.1%
0.0196     1      < 0.1%
0.0206     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
0.893      1      < 0.1%
0.8912     1      < 0.1%
0.8889     1      < 0.1%
0.8846     1      < 0.1%
0.8786     1      < 0.1%
0.8778     1      < 0.1%
0.8772     1      < 0.1%
0.8757     1      < 0.1%
0.8745     1      < 0.1%
0.8743     1      < 0.1%

fConc1
Real number (ℝ)

Distinct: 4421
Distinct (%): 23.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.21455974
Minimum: 0.0003
Maximum: 0.6752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 0.0003
5-th percentile: 0.0671
Q1: 0.1285
Median: 0.1964
Q3: 0.285
95-th percentile: 0.42208
Maximum: 0.6752
Range: 0.6749
Interquartile range (IQR): 0.1565

Descriptive statistics

Standard deviation: 0.11038355
Coefficient of variation (CV): 0.51446536
Kurtosis: 0.031125523
Mean: 0.21455974
Median Absolute Deviation (MAD): 0.0753
Skewness: 0.68688736
Sum: 4056.2518
Variance: 0.012184528
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.194                 18     0.1%
0.1939                16     0.1%
0.2126                16     0.1%
0.2                   16     0.1%
0.217                 15     0.1%
0.2251                15     0.1%
0.1581                14     0.1%
0.1279                14     0.1%
0.1772                14     0.1%
0.1245                14     0.1%
Other values (4411)   18753  99.2%

Minimum 10 values

Value      Count  Frequency (%)
0.0003     1      < 0.1%
0.0008     1      < 0.1%
0.0011     1      < 0.1%
0.0015     1      < 0.1%
0.002      1      < 0.1%
0.0047     1      < 0.1%
0.005      1      < 0.1%
0.0072     1      < 0.1%
0.0073     1      < 0.1%
0.0076     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
0.6752     1      < 0.1%
0.674      1      < 0.1%
0.643      1      < 0.1%
0.637      1      < 0.1%
0.6296     1      < 0.1%
0.6283     1      < 0.1%
0.6264     1      < 0.1%
0.6242     1      < 0.1%
0.6224     1      < 0.1%
0.6204     1      < 0.1%

fAsym
Real number (ℝ)

Distinct: 18704
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.177867
Minimum: -457.9161
Maximum: 575.2407
Zeros: 40
Zeros (%): 0.2%
Negative: 8380
Negative (%): 44.3%
Memory size: 295.4 KiB

Quantile statistics

Minimum: -457.9161
5-th percentile: -110.8296
Q1: -20.4791
Median: 4.0629
Q3: 24.1335
95-th percentile: 65.52956
Maximum: 575.2407
Range: 1033.1568
Interquartile range (IQR): 44.6126

Descriptive statistics

Standard deviation: 59.010059
Coefficient of variation (CV): -14.124447
Kurtosis: 8.2314395
Mean: -4.177867
Median Absolute Deviation (MAD): 21.6645
Skewness: -1.0379255
Sum: -78982.575
Variance: 3482.1871
Monotonicity: Not monotonic
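
The coefficient of variation is the standard deviation divided by the mean. That explains why it is negative for fAsym (the mean is negative) and why it becomes very large for a variable such as fM3Trans, whose mean is close to zero. A small sketch using the rounded figures from this report; the function name is illustrative:

```python
def coefficient_of_variation(std, mean):
    """CV = standard deviation / mean; undefined when the mean is zero."""
    if mean == 0:
        raise ZeroDivisionError("CV is undefined for a zero mean")
    return std / mean

# Standard deviation and mean as printed in this report (rounded values).
cv_fasym = coefficient_of_variation(59.010059, -4.177867)
cv_fm3trans = coefficient_of_variation(20.775268, 0.25936414)
print(cv_fasym, cv_fm3trans)
```

For variables centred near zero the CV says little about spread, so the standard deviation or the MAD is usually the more informative figure.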
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0                     40     0.2%
-0.0001               7      < 0.1%
8.8077                3      < 0.1%
-0.5062               3      < 0.1%
-1.4761               3      < 0.1%
5.2783                2      < 0.1%
21.8701               2      < 0.1%
-14.7987              2      < 0.1%
-1.2309               2      < 0.1%
-25.9338              2      < 0.1%
Other values (18694)  18839  99.7%

Minimum 10 values

Value      Count  Frequency (%)
-457.9161  1      < 0.1%
-449.9526  1      < 0.1%
-382.594   1      < 0.1%
-381.734   1      < 0.1%
-378.9457  1      < 0.1%
-368.633   1      < 0.1%
-363.3382  1      < 0.1%
-353.934   1      < 0.1%
-353.26    1      < 0.1%
-349.757   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
575.2407   1      < 0.1%
473.0654   1      < 0.1%
464.631    1      < 0.1%
444.401    1      < 0.1%
433.0957   1      < 0.1%
402.925    1      < 0.1%
402.1863   1      < 0.1%
400.284    1      < 0.1%
396.3379   1      < 0.1%
384.3477   1      < 0.1%

fM3Long
Real number (ℝ)

Distinct: 18693
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.618826
Minimum: -331.78
Maximum: 238.321
Zeros: 39
Zeros (%): 0.2%
Negative: 6558
Negative (%): 34.7%
Memory size: 295.4 KiB

Quantile statistics

Minimum: -331.78
5-th percentile: -79.41338
Q1: -12.7693
Median: 15.338
Q3: 35.8694
95-th percentile: 82.96334
Maximum: 238.321
Range: 570.101
Interquartile range (IQR): 48.6387

Descriptive statistics

Standard deviation: 50.900687
Coefficient of variation (CV): 4.7934381
Kurtosis: 4.7170806
Mean: 10.618826
Median Absolute Deviation (MAD): 25.3311
Skewness: -1.1296137
Sum: 200748.91
Variance: 2590.88
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0                     39     0.2%
-0.0001               4      < 0.1%
16.0747               3      < 0.1%
55.3976               2      < 0.1%
-19.9547              2      < 0.1%
10.6638               2      < 0.1%
9.164                 2      < 0.1%
40.4022               2      < 0.1%
20.2691               2      < 0.1%
19.839                2      < 0.1%
Other values (18683)  18845  99.7%

Minimum 10 values

Value      Count  Frequency (%)
-331.78    1      < 0.1%
-318.3002  1      < 0.1%
-297.1717  1      < 0.1%
-293.1762  1      < 0.1%
-287.5067  1      < 0.1%
-287.3636  1      < 0.1%
-284.7038  1      < 0.1%
-281.9541  1      < 0.1%
-281.844   1      < 0.1%
-281.435   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
238.321    1      < 0.1%
231.446    1      < 0.1%
227.8174   1      < 0.1%
226.3506   1      < 0.1%
222.417    1      < 0.1%
217.934    1      < 0.1%
217.624    1      < 0.1%
216.985    1      < 0.1%
215.894    1      < 0.1%
203.863    1      < 0.1%

fM3Trans
Real number (ℝ)

Distinct: 18390
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.25936414
Minimum: -205.8947
Maximum: 179.851
Zeros: 59
Zeros (%): 0.3%
Negative: 9343
Negative (%): 49.4%
Memory size: 295.4 KiB

Quantile statistics

Minimum: -205.8947
5-th percentile: -25.63852
Q1: -10.8358
Median: 0.75
Q3: 10.9489
95-th percentile: 26.89144
Maximum: 179.851
Range: 385.7457
Interquartile range (IQR): 21.7847

Descriptive statistics

Standard deviation: 20.775268
Coefficient of variation (CV): 80.100773
Kurtosis: 8.6759437
Mean: 0.25936414
Median Absolute Deviation (MAD): 10.8861
Skewness: 0.12358922
Sum: 4903.2791
Variance: 431.61177
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0                     59     0.3%
-0.0001               24     0.1%
0.0001                18     0.1%
11.1602               3      < 0.1%
-8.975                3      < 0.1%
-5.4454               3      < 0.1%
8.3039                2      < 0.1%
14.2184               2      < 0.1%
-9.7914               2      < 0.1%
11.8212               2      < 0.1%
Other values (18380)  18787  99.4%

Minimum 10 values

Value      Count  Frequency (%)
-205.8947  1      < 0.1%
-164.14    1      < 0.1%
-149.5513  1      < 0.1%
-142.5894  1      < 0.1%
-142.119   1      < 0.1%
-135.5051  1      < 0.1%
-134.75    1      < 0.1%
-134.395   1      < 0.1%
-133.1359  1      < 0.1%
-132.416   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
179.851    1      < 0.1%
170.692    1      < 0.1%
163.2697   1      < 0.1%
154.865    1      < 0.1%
143.8753   1      < 0.1%
139.2361   1      < 0.1%
132.589    1      < 0.1%
132.388    1      < 0.1%
131.5547   1      < 0.1%
130.8545   1      < 0.1%

fAlpha
Real number (ℝ)

Distinct: 17981
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.551644
Minimum: 0
Maximum: 90
Zeros: 5
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.92724
Q1: 5.5164
Median: 17.533
Q3: 45.704
95-th percentile: 80.71102
Maximum: 90
Range: 90
Interquartile range (IQR): 40.1876

Descriptive statistics

Standard deviation: 26.083055
Coefficient of variation (CV): 0.94669687
Kurtosis: -0.52103306
Mean: 27.551644
Median Absolute Deviation (MAD): 14.569
Skewness: 0.85745627
Sum: 520863.83
Variance: 680.32577
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.0002                7      < 0.1%
0                     5      < 0.1%
0.256                 4      < 0.1%
2.701                 4      < 0.1%
1.29                  4      < 0.1%
0.804                 4      < 0.1%
0.386                 4      < 0.1%
2.76                  4      < 0.1%
90                    4      < 0.1%
3.4161                4      < 0.1%
Other values (17971)  18861  99.8%

Minimum 10 values

Value      Count  Frequency (%)
0          5      < 0.1%
0.0002     7      < 0.1%
0.0003     2      < 0.1%
0.001      1      < 0.1%
0.0031     1      < 0.1%
0.0056     1      < 0.1%
0.0086     1      < 0.1%
0.009      1      < 0.1%
0.0097     1      < 0.1%
0.0103     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
90         4      < 0.1%
89.9798    1      < 0.1%
89.9579    1      < 0.1%
89.9535    1      < 0.1%
89.9528    1      < 0.1%
89.9229    1      < 0.1%
89.9155    1      < 0.1%
89.9087    1      < 0.1%
89.9076    1      < 0.1%
89.9042    1      < 0.1%

fDist
Real number (ℝ)

Distinct: 18437
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 193.71255
Minimum: 1.2826
Maximum: 495.561
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 295.4 KiB

Quantile statistics

Minimum: 1.2826
5-th percentile: 71.41204
Q1: 142.269
Median: 191.832
Q3: 240.409
95-th percentile: 326.5014
Maximum: 495.561
Range: 494.2784
Interquartile range (IQR): 98.14

Descriptive statistics

Standard deviation: 74.685712
Coefficient of variation (CV): 0.38554916
Kurtosis: -0.11237649
Mean: 193.71255
Median Absolute Deviation (MAD): 49.037
Skewness: 0.22878544
Sum: 3662135.8
Variance: 5577.9556
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
209.954               3      < 0.1%
195.287               3      < 0.1%
116.737               3      < 0.1%
246.013               3      < 0.1%
186.828               3      < 0.1%
295.34                3      < 0.1%
185.909               3      < 0.1%
216.032               3      < 0.1%
185.927               3      < 0.1%
100.395               3      < 0.1%
Other values (18427)  18875  99.8%

Minimum 10 values

Value      Count  Frequency (%)
1.2826     1      < 0.1%
5.5449     1      < 0.1%
5.5922     1      < 0.1%
5.6998     1      < 0.1%
5.7456     1      < 0.1%
6.564      1      < 0.1%
6.6852     1      < 0.1%
9.1574     1      < 0.1%
13.1108    1      < 0.1%
14.0229    1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
495.561    1      < 0.1%
466.4078   1      < 0.1%
450.953    1      < 0.1%
450.402    1      < 0.1%
450.349    1      < 0.1%
448.0295   1      < 0.1%
446.488    1      < 0.1%
438.901    1      < 0.1%
438.8574   1      < 0.1%
437.477    1      < 0.1%

class
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 295.4 KiB

Value  Count
g      12332
h      6573

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 18905
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
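
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name and the toy input are assumptions; the standard library exposes the general category but not the script or block, so only the category part of this section is covered.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

labels = ["g"] * 3 + ["h"] * 2      # toy stand-in for the class column
counts_by_category = category_counts(labels)
print(counts_by_category)           # 'Ll' is the code for Lowercase Letter
```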

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: g
2nd row: g
3rd row: g
4th row: g
5th row: g

Common Values

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%
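
The frequency column is each count divided by the total number of observations. A minimal sketch that rebuilds the class column from the two reported counts and recomputes the percentages (the function name is illustrative):

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, 100.0 * c / total) for v, c in counts.most_common()]

# Rebuild the class column from the counts reported in this section.
rows = frequency_table(["g"] * 12332 + ["h"] * 6573)
for value, count, pct in rows:
    print(f"{value} {count} {pct:.1f}%")
```

The classes are imbalanced at roughly two to one, which is worth keeping in mind when choosing evaluation metrics for a classifier trained on this data.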

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%

Most occurring characters

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  18905  100.0%

Most frequent character per category

Lowercase Letter

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%

Most occurring scripts

Value  Count  Frequency (%)
Latin  18905  100.0%

Most frequent character per script

Latin

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  18905  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
g      12332  65.2%
h      6573   34.8%

Interactions

[Figures: pairwise interaction plots for the 10 numeric variables (100 panels)]

Correlations


Auto

The auto setting chooses an interpretable pairwise metric according to the types of the two columns, using the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
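
To make the Numerical-Categorical case concrete, here is a minimal sketch of Cramér's V on a discretized numeric column. It is the plain, uncorrected formula; the equal-width binning rule, the default of 10 bins, the function names, and the toy data are all assumptions for illustration, and pandas-profiling's own implementation may differ (for example by applying a bias correction).

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row, col = Counter(xs), Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

def discretize(values, n_bins=10):
    """Map numbers to equal-width bin indices (binning rule is an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

alpha = [1.0, 2.0, 3.0, 4.0, 60.0, 70.0, 80.0, 90.0]   # toy numeric column
label = ["g", "g", "g", "g", "h", "h", "h", "h"]        # toy class column
v = cramers_v(discretize(alpha, n_bins=2), label)
print(v)
```

In the toy data the two bins separate the labels perfectly, so V reaches its maximum; with more bins the same data would spread over more cells and the value could change, which is the granularity effect described above.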

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
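
The calculation can be sketched in plain Python: rank both variables (ties receive the average of their positions), then apply the covariance-over-standard-deviations formula to the ranks. The function names and the toy data are illustrative, not the report's own code.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions i..j share one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Correlation of the rank variables of xs and ys."""
    return correlation(average_ranks(xs), average_ranks(ys))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]          # nonlinear but strictly increasing
rho = spearman_rho(x, y)
r_raw = correlation(x, y)        # same formula on the raw values
print(rho, r_raw)
```

Because y increases strictly with x, the ranks coincide and ρ is 1 up to floating-point error, while the same formula applied to the raw values stays below 1.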

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
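
A short sketch of that formula, together with a check of the location and scale invariance mentioned above. The data values are made up for illustration and the function name is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]
r = pearson_r(x, y)
# Shifting and rescaling either variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
# A negative scale factor only flips the sign.
r_flipped = pearson_r(x, [-v for v in y])
print(r, r_rescaled, r_flipped)
```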

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
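
The pair-counting definition translates almost directly into code. The sketch below implements exactly that plain variant (often called tau-a); statistical libraries commonly compute a tie-corrected variant instead, so results can differ when the data contain ties. The function name and toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This direct version compares every pair, so its cost grows quadratically with the number of observations.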

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
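
The per-column nullity count behind such a chart can be sketched as follows. The record layout, the function name, and the sample rows are hypothetical; in this dataset every column is complete, so the real counts are all zero.

```python
def nullity_by_column(rows):
    """Count missing (None) entries per column for a list of dict records."""
    counts = {}
    for record in rows:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts

sample = [
    {"fLength": 28.7967, "class": "g"},
    {"fLength": None, "class": "h"},   # hypothetical gap, not present in the real data
]
missing = nullity_by_column(sample)
print(missing)
```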

Sample

First rows

        fLength    fWidth   fSize   fConc  fConc1      fAsym    fM3Long  fM3Trans   fAlpha     fDist  class
    0   28.7967   16.0021  2.6449  0.3918  0.1982    27.7004    22.0110   -8.2027  40.0920   81.8828      g
    1   31.6036   11.7235  2.5185  0.5303  0.3773    26.2722    23.8238   -9.9574   6.3609  205.2610      g
    2  162.0520  136.0310  4.0612  0.0374  0.0187   116.7410   -64.8580  -45.2160  76.9600  256.7880      g
    3   23.8172    9.5728  2.3385  0.6147  0.3922    27.2107    -6.4633   -7.1513  10.4490  116.7370      g
    4   75.1362   30.9205  3.1611  0.3168  0.1832    -5.5277    28.5525   21.8393   4.6480  356.4620      g
    5   51.6240   21.1502  2.9085  0.2420  0.1340    50.8761    43.1887    9.8145   3.6130  238.0980      g
    6   48.2468   17.3565  3.0332  0.2529  0.1515     8.5730    38.0957   10.5868   4.7920  219.0870      g
    7   26.7897   13.7595  2.5521  0.4236  0.2174    29.6339    20.4560   -2.9292   0.8120  237.1340      g
    8   96.2327   46.5165  4.1540  0.0779  0.0390   110.3550    85.0486   43.1844   4.8540  248.2260      g
    9   46.7619   15.1993  2.5786  0.3377  0.1913    24.7548    43.8771   -6.6812   7.8750  102.2510      g

Last rows

        fLength    fWidth   fSize   fConc  fConc1      fAsym    fM3Long  fM3Trans   fAlpha     fDist  class
19010   32.4902   10.6723  2.4742  0.4664  0.2735   -27.0097   -21.1687    8.4813  69.1730  120.6680      h
19011   79.5528   44.9929  3.5488  0.1656  0.0900   -39.6213    53.7866  -30.0054  15.8075  311.5680      h
19012   31.8373   13.8734  2.8251  0.4169  0.1988   -16.4919   -27.1448   11.1098  11.3663  100.0566      h
19013  182.5003   76.5568  3.6872  0.1123  0.0666   192.2675    93.0302  -62.6192  82.1691  283.4731      h
19014   43.2980   17.3545  2.8307  0.2877  0.1646   -60.1842   -33.8513   -3.6545  78.4099  224.8299      h
19015   21.3846   10.9170  2.6161  0.5857  0.3934    15.2618    11.5245    2.8766   2.4229  106.8258      h
19016   28.9452    6.7020  2.2672  0.5351  0.2784    37.0816    13.1853   -2.9632  86.7975  247.4560      h
19017   75.4455   47.5305  3.4483  0.1417  0.0549    -9.3561    41.0562   -9.4662  30.2987  256.5166      h
19018  120.5135   76.9018  3.9939  0.0944  0.0683     5.8043   -93.5224  -63.8389  84.6874  408.3166      h
19019  187.1814   53.0014  3.2093  0.2876  0.1539  -167.3125  -168.4558   31.4755  52.7310  272.3174      h